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Abstract 

(D We generalize the Fortuin-Kasteleyn (FK) cluster representation of the parti- 

tion function of the Ising model to represent the partition function of quantum 
spin models with an arbitrary spin magnitude in arbitrary dimensions. This 
generalized representation enables us to develop a new cluster algorithm for 
the simulation of quantum spin systems by the worldline Monte Carlo method. 
Because the Swendsen-Wang algorithm is based on the FK representation, the 
new cluster algorithm naturally includes it as a special case. As well as the 
general description of the new representation, we present an illustration of our 
new algorithm for some special interesting cases: the Ising model, the antifer- 
-p-; romagnetic Heisenberg model with 5=1, and a general Heisenberg model. 

The new algorithm is applicable to models with any range of the exchange 

G interaction, any lattice geometry, and any dimensions. 

KEYWORDS Quantum Monte Carlo, Cluster Algorithm, XXZ Model, 

O Heisenberg Model, XY Model 
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I. INTRODUCTION 



In 1987, Swendsen and Wang [1] used the Fortuin and Kasteleyn (FK) representation 
[2] of the partition function of Ising models in a Monte Carlo simulation of these models. 
With the use of this representation, they were able to produce an algorithm whose key 
feature was the global updating of Ising spin configurations in contrast to local updating in 
the standard Metropolis algorithm. With global updating, their cluster algorithm greatly 
reduced the autocorrelation times in the simulation near a critical point. Since then, several 
attempts have been made [3] to reduce autocorrelation times of various systems by various 
forms of cluster algorithms, and recently the construction of a cluster algorithm has been 
formulated on more general grounds [4-7]. Still, most applications of cluster algorithms have 
been restricted to classical models. 

In this paper, we discuss cluster algorithms for worldline Monte Carlo (WLMC) simu- 
lations of general classes of quantum spin systems. For most quantum spin systems, exact 
knowledge of the properties of the systems is very restricted. To obtain information of 
a wider range, one often resorts to numerical methods such as exact diagonalization, ex- 
pansions with respect to small parameters, and quantum Monte Carlo methods. Among 
these, only quantum Monte Carlo methods are available for relatively large systems. Sev- 
eral variants of the quantum Monte Carlo simulation exist: Green's function Monte Carlo 
(GFMC) method [8], the projector Monte Carlo method [9], the auxiliary-field method based 
on Hubbard- Stratonovich transformation [10], Handscomb's method [11], and the worldline 
Monte Carlo (WLMC) method based on the Suzuki-Trotter (ST) approximation [12]. For 
the study of ground state properties, the GFMC is particularly useful. The auxiliary-field 
method is powerful for Hubbard models and related problems. Handscomb's method lacks 
the systematic error caused by the ST approximation. The WLMC has enjoyed a wide range 
of applicability to fermion, boson, and quantum spin models. For various models, it is the 
simplest and perhaps the most widely known. 

For fermion problems and frustrated spin systems, the main difficulty shared by all 
the above-mentioned methods is the well-known negative sign problem. In some special 
cases, one of the methods can be particularly useful compared to others, as is the case 
for the auxiliary field method applied to the Hubbard model with particle-hole symmetry. 
However, for many other models, such as frustrated spin models, the WLMC is commonly 
used because no other method is particularly efficient in reducing the difficulty. In this 
paper, we do not address the sign problem. We will rather focus on another difficulty which 
has been under-appreciated. The WLMC suffers from long autocorrelation times even when 
the system is not near a finite-temperature critical point [13]. Although a similar problem 
may exist for the other quantum Monte Carlo techniques, this problem has not yet been 
studied systematically. One way to overcome this difficulty is to develop a cluster updating 
as in the SW method. In this paper we present the details of such a method. 

In general, it is non-trivial to find a cluster algorithm. The generalized approaches [4-7], 
however, provide a starting point. They first require the specification of a proper set of 
local graphs by which the whole system is decomposed into clusters and also require a 
non-negative solution to a system of linear equations (weight equations) that is often under- 
determined. Little a priori guidance is given on the construction of these graphs, and even 
the existence of a non-negative solution is not guaranteed. Nevertheless, solutions exist in 
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some simple cases. The Swendsen-Wang (SW) algorithm, for instance, is one such case. 
Here, the number of weight equations is only two as is the number of independent variables. 
A slightly more complicated case is the loop algorithm [14,15] for the six-vertex model. In 
the massless case, for example, both the number of equations and the number of variables 
are three. This algorithm was successfully applied in a WLMC simulation of the spin 1/2 
antiferromagnetic Heisenberg model [16] because the S = 1/2 quantum spin systems can be 
mapped to the six- vertex model by using ST approximation [12]. This particular algorithm 
can also be viewed as the simplest example of the general method that we present in this 
paper. 

For a general quantum spin system, the number of the weight equations and independent 
variables can be very large. As we will see below, even in the next simplest case, the case 
of the Xy-like XX Z model, is already somewhat difficult to handle as the number of the 
equations is 7 whereas the number of the variables are 11, Here, it does not seem guaranteed 
that a meaningful solution exists. In fact, however, at least one meaningful solution exists 
for any system regardless of the magnitude of spins or the coupling constants. This rather 
surprising result is what we will present in this paper as well as a practical method for 
obtaining the solution. Instead of working on the complicated weight equation itself, we will 
take another approach which naturally leads to a proper choice of graphs and a solution of 
the equations; that is, we will generalize the FK cluster representation of the Boltzmann 
weight of Ising models to the quantum models. We thereby propose a new algorithm which 
potentially reduces the autocorrelation times greatly. We have already applied the new 
algorithm to the S = 1 antiferromagnetic Heisenberg chain [17], achieving a reduction of 
some autocorrelation times by as much as four orders of magnitude. These algorithms 
differ from previous attempts [18] to generalize the FK transformation in that they produce 
clusters whose states can naturally be specified by a single variable each and can change 
independently. In this paper, we present the details of the construction of the algorithm. 

It is useful to outline the algorithm before the detailed explanations and the mathematical 
proofs. The lattice we will work on is not the original lattice on which the quantum problem 
is defined. Instead, we will consider many lattices, each of which is geometrically equivalent 
to the original lattice, and take this set of lattices as a hyper-lattice which has dimensions 
greater than the original dimensions by one. This hyper-lattice can also be viewed as a 
collection of local units which we call plaquettes and links. One Monte Carlo step of the 
new algorithm consists of a labeling process applied to the plaquettes and a flipping process 
for clusters formed by these labels. In the labeling process, we assign a label, i.e., a graph, 
stochastically to each local unit. More specifically, in each local unit, we connect vertices 
by edges which are chosen probabilistically. After this graph assignment is done for all 
plaquettes, the union of all local graphs forms a global graph defined for the entire lattice: 
the connected vertices form clusters and these clusters constitute the global graph. In the 
flipping process, each cluster is flipped randomly with probability 1/2. The question to be 
answered is how do we obtain both the set of local graphs and the labeling probabilities. 

In this paper, we will develop various notations and definitions needed to describe our 
algorithm without ambiguity. To this end, in Section II, we first summarize the general 
framework of cluster representation and cluster algorithm, and then we illustrate how a 
particular problem fits into this general framework by presenting two examples, the SW 
algorithm for the Ising model and the S = 1 antiferromagnetic Heisenberg chain. These 
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examples will not only illustrate the notation but also give a peek at the generalization of 
the FK representation and the Monte Carlo algorithms that we will derive from it. After 
these examples, formal definitions for the words introduced in the examples will be given, 
in Section III. In this section, we also present, without derivation, the new cluster represen- 
tation for the general XXZ models. Section IV is the essential part of this paper. In this 
section, we will show how to obtain a solution of the weight equation for the general XXZ 
spin model. We also propose a practical method for computing the solution. In Section V, 
we present the compact formula for the solution of the weight equation for the Heisenberg 
models. In Section VI, the ergodicity of the new algorithm is discussed. Section VII is a 
brief summary. We also discuss a further generalization of our results to the XY Z models 
in the appendix. 

II. FRAMEWORK AND EXAMPLES 

In this section, we first show how the FK-type representation of the partition function is 
used in a Monte Carlo simulation and then present two pedagogical examples to introduce 
symbols and notation which we will use in the subsequent sections to generalize the FK 
representation and to develop the new algorithms. 

A. The cluster representation 

In general, the partition function of a lattice spin system is the sum of some weight 
function over all possible 'spin configurations' 

Z = J2W(n), (2.1) 
n 

where 

n = (ni,n 2 , • • • ,n Nv ) (2.2) 

denotes a set of one-bit variables each of which takes a value or 1. This type of configura- 
tional summation is the case even with quantum spin systems when they are mapped by the 
ST decomposition to problems with classical degrees of freedom. Here, we are considering 
N v vertices which are lattice points in the case of Ising model but will not necessarily be 
lattice points in the more general cases to be discussed below. In the subsequent sections, 
we will show that this weight function W(n) itself can be expressed as a sum of another 
weight function over a variable different from the 'spin configurations' n, 

W(n) = J2W(n,G). (2.3) 

G 

This new variable G is a graph defined on the lattice. The equation (2.1) with (2.3) can be 
viewed as a partition function of a system that consists of vertex variables n and graphs 
G interacting each other. A graph consists of edges each of which connects two vertices. It 
imposes a strong restriction on the values that the variable n can take because the weight 
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function W(n, G) will be zero for many 'spin configurations'. Therefore, we can express the 
weight function W(n, G) as 

W(n, G) = V(G)A(n, G), (2.4) 

where the function A(n, G) represents the restriction imposed by G, i.e., it takes the value 
1 if n is compatible with G and otherwise. 

All the functions that appeared so far also factorize into products where each factor is 
defined on a local unit denoted by u, i.e., 

W(n) = Y[w(n(u)), (2.5) 

U 

V(G) = Y[v(G(u)), (2.6) 

u 

A(n,G) = ]jA(n(u),G(u)). (2.7) 

u 

A local unit is a bond, i.e. a pair of vertices, in the case of the SW algorithm, whereas it 
is more than two vertices in general. The symbol n(u) is the part of n that concerns the 
unit u. The symbol G{u) has the similar meaning. The last equation (2.7) means that the 
restriction imposed by the graph G is collection of local restrictions imposed by its subgraphs 
G{u). Re-expressing (2.3) in the factorized form, we obtain 

w(n(u))= J2 v(G(u))A(n(u),G(u)), (2.8) 
G(«)er(u) 

where T(u) is the set of graphs for the problem at hand. 

We must assume another property of the restriction imposed by G: when we define a 
one-bit cluster variable a c = or 1 on each cluster in G, a way must exist for specifying 
an arbitrary state compatible with G in such a way that the rii depends on i only through 
a c where c is the cluster to which i belongs. Here, a cluster is a maximal group of vertices 
connected by edges to which another vertex cannot be added without loss of connectivity. 
In general, a graph G has many clusters. To be more specific, defining N C (G) as the num- 
ber of clusters in G and S(G) as a set of configurations compatible with G, we assume a 
bijection exists which maps {0, l}-^ 6 ) onto S(G) in such a way that (o"i, a 2 , • • • , cat c ) maps 
to (ni,n 2 , • • • , 7ijv„) with rii depending only on a c and i£c. 

Thus, our problem is clearly defined. Of course, it is not obvious a priori that a given 
partition function can be expressed in the form described here. Most of this paper will be 
dedicated to the derivation of this type of representation for quantum spin systems. In the 
next subsection, we will discuss how we can construct a proper Markov process assuming 
that we are already given the above representation. 



B. Cluster Monte Carlo method 

We consider a Markov process (n^\ G^) (t = 1, 2, 3, • • •) on the extended phase space 
S X T [7], where E is the configuration space, i.e. the set of possible values that n can take, 
and T is the space of graphs. The Markov process is characterized by two transition matrices: 



5 



the first one Ti,(G'\n, G) gives the probability of having a graph G' given a state (n,G), 
and the second one T F {n'\n, G) gives the probability of having a new set of vertex variables 
n'. More specifically, Tl and Tp, called labeling probability and the flipping probability, are 
defined by 

T L (G'\n, G) = Pr(G (t+1) = G"|n (t) = n, G (t) = G), 
T F {n'\n, G) = Pr(n( t+1 ) = n V° = ™, G(t+1) = G ), 

where Pr(X|Y") is the conditional probability of having an event X given an event Y . 

In [7], a special class of cluster algorithm is discussed in which each cluster can be flipped 
independently as if it were a single non-interacting degree of freedom. This type of cluster 
algorithm is called free. To obtain a free cluster algorithm [7], we need to choose the flipping 
probability as 

T F (n'\n, G) = A(n' , G)/N(G), (2.9) 

where N(G) = |S(G)| = J2n G). As mentioned in the last subsection, the symbol 

£((j) stands for a set of configurations defined by 

E(G) = {n e S|A(n, G) = 1}. (2.10) 

Because of the property of A(n, G) discussed in the last section, all variables in a cluster 
are 'locked' into a single one-bit degree of freedom and N(G) equals 2 N ^ G \ where N C (G) is 
the total number of clusters. 

Therefore, the implication of the specific form (2.9) for the flipping probability is that 
we flip each cluster in the given graph G at random with probability 1/2 as if each were a 
single one-bit degree of freedom not interacting with the others. Although we could choose 
the flipping probability different from (2.9), the choice we made is obviously advantageous 
in some respect. Computational simplicity is such an advantage. In addition, we can use 
the improved estimator [19] for the magnetic cumulants. 

Given the flipping probability (2.9), we need to choose the labeling probability as 

T h (G'\n, G) = V(G')A(n, G')/W(n) (2.11) 

in order that the limiting distribution is the one desired. It is easy to see that the two 
transition probabilities (2.9) and (2.11) have the distribution W{n, G) as their stationary 
distribution. In other words, the distribution W{n, G) is an eigenstate with the eigenvalue 
1 of both the transition matrices, i.e., 

W(n', G)=J2 T F (n'\G)W(n, G), (2.12) 

W(n, G 1 ) = J2 T h (G'\n)W(n, G). (2.13) 
Ger 

Here we used the fact that T F {n'\n, G) and Ti,(G'\n, G) are independent of n and G, re- 
spectively, and used the abbreviated notation rp(n|G) and TL(G\n) for the transition prob- 
abilities. Therefore, when the ergodicity holds for the algorithm, the only possible limiting 
probability-distribution is proportional to W{n, G). Indeed, we can prove that the algorithm 
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is ergodic in the case of ferromagnetic and XK-like models. In the case of antiferromagnetic 
model, although we have not proved the ergodicity itself, we can prove that if the conven- 
tional algorithm is ergodic, the new algorithm is also ergodic. This issue is discussed in 
Section VI. 

We also note that because of (2.3) 

E W(n,G) (2.14) 
Ger 

is the distribution that we want to generate. This fact implies that we can obtain the 
distribution W(n) simply by generating a Markov sequence described above, picking the 
vertex- variable part from the state (n^\ G^), and ignoring the graph part 

In an actual Monte Carlo simulation, the labeling process is done locally because of the 
decomposability of V(G)A(n, G), i.e., (2.6) and (2.7). Therefore, we can generate a graph 
G with the probability (2.11), simply by picking a graph G{u) for each local unit with 
probability 

P h (G(u)\n(u)) = v(G(u))A(n(u), G(u))/w(n(u)), (2.15) 

and then taking the union of these G( , u)'s as G. We can easily see that the probability (2.15) 
is properly normalized because of (2.8). 



C. The Fortuin-Kasteleyn representation and the Swendsen-Wang algorithm 

In this subsection, we briefly review the simplest and best-known cluster algorithm, 
the SW algorithm for the Ising model, and its connection with the FK representation. 
This algorithm and connection have already been discussed [3]; however, we need a more 
formal discussion to achieve our goal of establishing a mathematically rigorous background 
for general cluster algorithms. We also found that Fortuin and Kasteleyn's definitions and 
notation are not completely convenient to describe more general and complicated algorithms. 
Therefore, in this subsection, we will review the FK representation and the SW algorithm in 
a language which will eventually develop to accommodate more general ideas as we proceed. 

The Hamiltonian of the Ising model can be written as 

H = -J E( 2n * - l )( 2n o ~ !)> ( 2 - 16 ) 

where rii = 0, 1. Correspondingly, the partition function of the Ising model is written as 

Z = J2W(n). (2.17) 
n 

Here n = (ni, ra 2 , ' ' ' j n \L\) 1S a se t of \L\ one-bit variables where \L\ is the number of 
lattice points of the lattice L. The function W(n) is the Boltzmann weight which can be 
decomposed into a product of local factors defined on local units, i.e., bonds in this case, 

W(n) = l[w(n(b)), (2.18) 
b 
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where n(b) = (ni,rij) is a part of n concerning the two end-points i and j of a bond b. The 
local Boltzmann factor w(n(b)) is then defined by 

w(n(b)) = exp(K(2n, - l)(2n J - 1)), (2.19) 

where K = f3J . Equation (2.19) can be expressed as a sum of two terms each of which 
corresponds to one of two graphs often called 'deleted' and 'frozen' 

w(n(b))= £ v(g)A(n(b),g), (2.20) 
g=d,f 

where 

v(d) = e- K and A(n(6),d) = l (2.21) 

for a 'deleted' bond and 

v(i) = e K -e~ K and A(n(b),i) = 8(ni,rij) (2.22) 
for a 'frozen' bond. Substituting (2.20) into (2.18), we have 

W(n) = l['£v(g b )A(n(b),g b ), 

b 9b 

= E(n«nwu)), (2.23) 

The product Ylb A(n(b), gb) implies that a configuration n is allowed only if two variables 
rii and rij at the end-points of any frozen bond are equal. Otherwise, this product vanishes 
and the corresponding term does not contribute to the sum. We can visualize this situation 
by placing edges on frozen bonds and nothing on deleted bonds. These edges form a graph 
which we denote as G. Obviously, this graph has the same information as the set of variables 
{gb}- Therefore, we will identify these two and simply write G = {gb}- A cluster is a 
maximal set of vertices connected to each other by edges. A graph consists of many clusters, 
in general. It is clear that for a given graph G an allowed state is one where all variables 
rii in the same cluster in G have the same value or 1. In other words, all variables in a 
cluster are locked into a single degree of freedom represented by a c = or 1 where c specifies 
a cluster in G. 

If we adopt the notation used in (2.7), we can write g b = G(b), and then rewrite (2.23) 

as 

W(n) = J2V(G)A(n,G), (2.24) 

G 

where 

V(G) = Y[v(G(b)), (2.25) 

b 

A(n, G) = ]J A(n(b), G(b)). (2.26) 

b 

(2.27) 
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Substituting (2.24) into (2.17), we have 

Z = J2Y,nG)A(n,G). (2.28) 
n g 

This equation is the Fortuin-Kasteleyn representation for the Ising model. Although it is 
not explicitly indicated, Z and V(G) depend on the model and the coupling constant K . 
On the other hand, A(n, G) will be used for other models by generalizing its definition. In 
what follows, we will see exactly the same form as (2.28) with different Z and V(G) for 
other models. 

Because we have rewritten the partition function in the form discussed in Subsection II A, 
we can construct the Markov process following the general prescription given in Subsec- 
tion II B. The resulting algorithm is the SW algorithm. To be specific, given a configuration 
n, we first choose a graph G with the weight V(G)A(n, G) and then pick a new configura- 
tion n' with the weight A(n', G). The first step is equivalent to choosing a local graph G(b) 
with the weight v(G(b))A(n(b),G(b)). Because of (2.20), the probability of assigning G(b) 



to a given bond in state n(b) is 

P L (G(b)\n(b)) = v(G(b))A(n(b), G(b))/w(n(b)). (2.29) 

This result agrees with the ordinary SW labeling probability 

P L ('deleted'|(0, 0)) = P L ('deleted'|(l, 1)) = e~ 2K , (2.30) 

P L ('deleted'|(0, 1)) = P L ('deleted'|(l, 0)) = 1, (2.31) 

P L ( , frozen'|n(6)) = 1 - P L ('deleted'|ra(&)). (2.32) 



On the other hand, the second step is, as discussed in the last subsection, equivalent to 
setting each cluster variable a c to or 1 with equal probability. 

D. Generalization to quantum spin systems 

In this subsection, we will give an example of the generalization of the FK representation 
to quantum spin systems. We will discuss the S = 1 antiferromagnetic Heisenberg model in 
one dimension. The Hamiltonian is written as 

N 

H = J'£S i -S i+1 , (2.33) 

i=l 

where S? = 2 and 

[S^ 1 , Sf ] = yZ-LSijS], (a, (3, 7) = (x,y, z), (y, z, x), or (z, x,y). (2.34) 

The periodic boundary condition, i.e., S^+i = Si is assumed. We also assume that N is 
even. Then, by applying the unitary transformation 

Sf^-Sf and Sf^-Sf (2.35) 

to all sites with even i, we have 
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N 

H = -JY,hi+u ( 2 - 36 ) 

i=l 

where 

Aij^SfSf + SySV-StSl (2.37) 

We remark that these assumptions are not essential to the general algorithm described in 
the following sections. 

Usually, we take the basis in which S* is diagonalized for Monte Carlo simulations. Then, 
a basis vector is specified by a set of N classical variables, each of which takes a value —1,0 
or 1. The Hilbert space is, of course, 3^ dimensional. However, this basis is inconvenient 
for our purpose of mapping the problem into a one-bit problem of the form discussed in 
Subsection II A. Therefore, we will first map the original quantum problem into another 
quantum problem with one-bit degrees of freedom. To do this, we first decompose each spin 
operator into a sum of two Pauli operators. 

ST = + ^, 2 ))> a = x >2/> or z > ( 2 - 38 ) 

where 

= 2 ^ s ^°hy ( 2 - 39 ) 

Then, we take a basis in which the z-components of the Pauli operators are diagonalized. 
Therefore, a basis vector can be expressed as |n) where n is a set of 2N one-bit variables 

n = ™(1,2), ™(2,1), ""(2,2), "-(3,1), ""(3,2), ' ' ' ™(AT,1) , ™(AT,2)) , (2.40) 

and has the property 

a z M \n) = (2n M - l)\n), n M = or 1. (2.41) 

Clearly, the Hilbert space is 2 2N dimensional and larger than the original Hilbert space. The 
new Hilbert space is larger because it includes some singlet states which are unphysical here. 
Since we should not count such states in the partition function, the partition function in 
this basis becomes 

Z = J2( n \ Pe ~ mp \ n )- (2-42) 
n 

The projection operator P is the product of local projection operators Pi which projects out 
states with S 2 = 3/4 



n^- ( 2 - 43 ) 



It is easy to see that Pi is the symmetrization operator 



P^^I + X,) (2.44) 
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where / is the identity operator and X{ is the operator that exchanges and n^^)- 

Thus, the original problem is mapped to a problem with one-bit degrees of freedom (2.42). 
This problem still is a quantum problem which needs the evaluation of matrix elements of the 
Boltzmann density operator. Therefore, the next thing we must do is to map this problem 
to a classical one-bit problem by the ST decomposition. We can write Pe~^P in (2.42) as 

Pe~ Pn P = (p e ~ A ^p) M T ~ [(p e -ArW B p^p e -ArW p)]M Tj (345) 

where My At = (3 and 

n E = -J J2 hi+u n Q = -J E hi+i- ( 2 - 46 ) 

iieven i:odd 

The integer Mt is called Trotter-number. By inserting the identity operator 

J = 5>')(n'| (2.47) 
n' 

between the factors in (2.45), we can re-express the partition function (2.42) as 

^=E E ■■■£ 

(n 1 \Pe- ATn *P\n M )(n M \Pe- ATno P\nM-i) • • • (n 2 \P e - ATn ° P\ ni ) , (2.48) 
where M = 2M T . 

Because of the form of (2.48), it is natural to introduce a hyper-lattice which consists of 
M lattices, each of which is equivalent to the original lattice. Because the original lattice is a 
ring in the present case and we are automatically imposing the periodic boundary condition 
in the new direction, the hyper-lattice is a torus. We will refer to this hyper-lattice by a 
symbol L and each ring in this hyper-lattice by a symbol L k (k = 1,2, •• • ,M). We call 
MN lattice points in the hyper-lattice sites. As we have seen, two one-bit variables are 
defined on each site. They can be viewed as variables defined on two vertices inside the site. 
Thus, the hyper-lattice is a set of 2MN vertices. In what follows, we refer to the set of 
vectors {ni,n 2 , • • • , tlm} simply as n. We also refer to the variable on as n^^^y 

Accordingly, the vertex on which n^j^ is defined will be referred to as (k,i,/j,). The site 
that contains the vertices (&, 2,1) and (&, i, 2) will be referred to as (k,i). In Fig. 1, the 
hyper-lattice is shown for the case of N = 6 and Mt = 2. Next, we introduce the symbol 
n(V) that stands for the part of n concerning a set V where V is any set of vertices. For 
example, n((k,i,fi)) is simply n((k,i)) stands for (n(fc,;,i), n(fc,;,2)), n(L k ) is n k , and 

n(L) is n itself. Actually, a special case of this notation has already been used in (2.7). 

Having defined this notation, we note that the state \n(Lk)) can be expressed by a direct 
product of local wave functions as 

WLk)) = (g) |n(/ (fc , )) 2> (2.49) 

iieven 

where l(k,i) is a pair of sites + 1)}, or equivalently a set of four vertices 

{(&,2,1),(&,2,2), (fc,i + U),(M + l,2)}. 
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FIG. 1. The 'checkerboard' hyper-lattice for the 5=1 XY model on a 
chain of the length 6 with Trotter number 2. Only shaded plaquettes are called 
'plaquettes' here. On each plaquette four-body Boltzmann factor is defined. 



To be more specific, \n(l^ k ^)) 2 is an eigenvector of four Pauli operators ^ and cr^ i+1 ^ 
(/x = 1,2) in a 2 4 dimensional local Hilbert space. Therefore, because [A^+i, Ajj + i] = if 
\i-j\ > 2, we obtain 



(n(L k+1 )\Pe- ATn *P\n(L k )) = w(n(p [k ^)), 

iieven 

where P( k ,i) is a plaquette defined by P( k> i) = l(k,i) U and 

w{n{p {Ki) )) = (n{l {k+1 ^)\P l P l+1 e KKi ' i+1 P l P l+1 \n{l {Ki) )) 2 , 



[2.50) 



:2.5i: 



with K = At J. An expression similar to (2.50) is available for rio. Therefore, the partition 
function (2.48) can be rewritten as 



Z = J2W(n) 



n 



where 



W(n) = l[w(n(p)). 



;2.52) 



:2.53) 
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The product in (2.53) is taken over P(k,i) with even k + i. 

Here we note that w(n(p)) depends n(p) only through m^,i) = X)^ ^(fc.i,^) where {k, i) 
one of four sites in the plaquette p. Therefore, we can re-express w(n(p)) in the form 



is 



w(n(p)) = w m ^ TO (*+W> . 
We further note that this w has several symmetry properties 



:2-54) 



w 



w 



m t i m tr 
m b i m br 

m tr m t i 
m br m b i 



10 



10 



m b i m br \ 
m t i m tr y 

25 — m b i 25 — m br 
2S — m t i 25 — m tr 



:2.55) 



where 5 = 1 in the present case and the suffix ' tl' means top-left corner of the plaquette, 'tr' 
means top-right, etc. These symmetry properties define classes of states which correspond 
to the same local weight w regardless of the value of K . In what follows, we will specify one 
of these classes by a symbol S{jp). The 'particle number conservation', imposes a restriction 
on w, i.e., w is non-zero only if 



mu + m hl = m t i + m tr 



:2.56) 



We can eliminate from consideration some of the classes that violate this condition. As a 
result, only seven distinct values among 2 8 = 256 values of w(n(p)) exist, i.e., there are 
seven relevant classes of states, 



i5(2) 

w(3) 
i5(4) 
i5(5) 
i5(6) 

w(7) 



w 



w 



w 



w 



w 



w 



12r 

It +6r" 1 +4r" 2 

It -6r _1 +4r" 2 
-4r +4r" 2 
— 6r +6r _1 

6r +6r _1 

8r +4r" 2 



C2.57) 



where r = exp(— K ). 

Equations (2.52) and (2.53) have the same form as (2.1) and (2.5). Therefore, the next 
thing we have to find is a set of local graphs T(p) and coefficients v(G(p)) that satisfy (2.8) 
with u replaced by p 



w(n(p))= J2 v(G(p))A(n(p),G(p)). 
G(p)er( P ) 



(2.58) 
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To this end, we consider the graphs which correspond to pairing of the eight vertices in p, i.e., 
graphs which consist of four edges sharing no vertices. From the 105 such graphs, we take 
only those graphs that consist of vertical or horizontal edges. Here, a vertical edge connects 
two vertices whose spatial indices are the same and the temporal indices are different whereas 
a horizontal edge connects two vertices whose spatial indices are different and the temporal 
indices are the same. There are 24 such graphs. When we neglect distinctions between 
vertices in the same site, these 24 graphs are classified into three classes which are represented 
by the diagrams in the leftmost column in Fig. 2. We use a symbol Q{p) to specify a class 
of graphs. 
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FIG. 2. The coefficient N(S(p), G(j>)) for the 5 = 1 antiferromagnetic 
Heisenberg model. S(j>) specifies a class of states while G(j>) specifies a class 
of graphs. In the diagrams of the leftmost column, a solid line stands for a 
green edge and a dashed line a red edge. 



We next define A(n(p), G(p)) in (2.58) as the function that takes the value of 1 only 
if, for any edge connecting two vertices v and v 1 ', n(v) = n(v') when the edge is vertical 
and n(v) = 1 — n(v') when the edge is horizontal. Otherwise, this function takes the value 
of zero. For later convenience, we call edges for which two vertex variables have the same 
value green and all other edges red. In the present case, all vertical edges are green whereas 
all horizontal edges are red. All edges in what follows have this additional attribute, i.e., 
'color'. With this definition, we can formally specify the set of graphs T(p) in (2.58) as 
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T(p) = "Every vertex is an end-point of one and only one edge" and 

"Any edge is either green vertical or red horizontal."} (2.59) 

Given (2.57) and (2.59), we next find a solution v(G(p)) of (2.58). It is natural to seek 
a symmetric solution in which v(G(p)) depends on G(p) only through the class to which 
G(p) belongs, i.e., v(G(p)) = v(Q(p)) where G(p) (E G(p)- Taking this into account, we can 
rewrite (2.58) as 

(s(p)) = E N{s{p), g(p)wg(p)). (2.60) 

Here, the function N is defined by 

N(S(p),g(p))= £ A(n(p),G(p)), (2.61) 
G(p)eO( P ) 

where n(p) £ S(p). The function N is shown in Fig. 2. We can easily see that the solution 

v(l) = lw(l), 
5(2) = itf(5), 

5(3) = lw(3) (2.62) 

satisfies (2.60). 

Now, since we obtained the representation in the form discussed in Subsection II A, 
the Monte Carlo method in Subsection II B applies. The resulting algorithm is the loop 
algorithm used in [17]. 

In this subsection, we found the solution v(G(p)) given a set of graphs T(p), but did not 
show how we choose T(p) or how we obtain the solution v(G(p)) in general. These questions 
will be answered in the following sections. 



III. CLUSTER REPRESENTATION OF GENERAL XX Z MODEL 

In Section II D, we saw how S = 1 antiferromagnetic Heisenberg chain is mapped to 
one-bit classical problem. In this section, we will see how the mapping is done in the case of 
general XX Z spin systems. Our task can be divided into two parts. In Subsection III A, we 
formulate the problem as a one-bit classical problem with a weight function that factorizes 
into a product of local factors. Each one of the local factors is defined on a local unit called 
a plaquette, as we saw already. At this stage, graphs are not yet introduced. In short, in 
Subsection III A, we will formulate the problem in the form of (2.1) with (2.5). Then, in 
Subsection III C , we show without derivation how w(n(u)) can be expressed as a sum of 
terms each of which corresponds to a local graph G(u); that is, we will present an explicit 
form of (2.8) for the XXY model. This expression leads to (2.3). The derivation will be 
presented in Section IV. To make the discussions clearer, in Subsection III B, we summarize 
and generalize the notations and definitions introduced in Subsection II D concerning graphs. 
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A. Mapping to a classical problem 



Our Hamiltonian is 



(i,3) 

= - E JiASfs? + s»s» + \ hJ s?s>), (3.1; 



with J?- = J^- = Jij and J? - = Jij^ij- The operator 5™ is the spin operator which satisfies 

5? = (S? f + (S«) 2 + [Stf = S(S + 1) (3.2) 

and 

for (a,/3,7) = (x,y,2;), (y,2;,x), or (z,x,2/). The symbol in (3.1) is an arbitrary 

undirected pair of elements of the set of lattice points L. The constants Jij and Xij can be 
any real numbers. We are not assuming any particular geometric feature of the lattice. It 
can have any dimension and does not even have to be translationally symmetric. 

The first important step is to express a spin operator in terms of sum of 2S Pauli 
operators as we did in Section II D, 

-1 2S 

St = 2 E (3-4) 

To each Pauli matrix, we will assign a vertex. Therefore, it is convenient to define L as 
a set of vertices defined on the lattice L. We work with a complete set of states that are 
simultaneous eigenfunctions of the operators a* [i = 1, 2, • • • , \L\; \i = 1, 2, • • • , 2S). Here, 
\L\ is the number of the lattice points in the lattice L. To be more specific, 

O(Z)} =(2n^-l)|n(i)), (3.5) 
where n(L) represents 2S , |L| one-bit variables, 

n(L) EE (71(1,1), 71(1,2), • • • , ™(|L|,2S))- (3-6) 

The partition function in this basis becomes 

Z= J2 {n(L)\Pe- fi7i P\n(L)), (3.7) 
n(L) 

where P is the projection operator defined by 

P = UPi, (3.8) 
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and Pi is the projection operator to the space in which S? takes the value S(S-\-l). Since this 
representation with Pauli matrices provides a single representation for which S? = S(S + 1), 
(3.7) is the correct representation of the original model. 

As we saw in Subsection II D, using the ST approximation [12] and (3.4), we can map the 
original quantum problem into a problem with classical one-bit degrees of freedom. However, 
since there are so many variants of the ST approximation, it is impractical to describe all of 
them. Therefore, in what follows, we will only describe properties which are shared by all 
known variants. 

Once mapped, the problem has 2SM\L\ one-bit variables, where M is some integer pro- 
portional to the Trotter- number Mt- The proportionality constant depends on the variant 
of the ST approximation. The 2SM\L\ variables naturally fit into a d + 1 dimensional 
hyper-lattice, where d is the dimension of the original lattice L. We call the first dimen- 
sion temporal or vertical and the other d dimensions which correspond to the dimensions of 
the original lattice spatial or horizontal. This lattice consists of M layers of d- dimensional 
lattices each of which is equivalent to the original lattice L. We number these layers with 
an index k = 1,2, • • • ,M and use to denote the k-th layer. We call a lattice point in 
this hyper-lattice a site. 2S vertices are associated with each site, and a one-bit variable is 
defined on each vertex. We label a site with two indices {k,i), with the first index specifying 
the temporal location of the site, and the second index specifying the spatial location. As 
for vertices, we use symbols {k, i, fi) by adding an additional index \i = 1, 2, • • • , 2S specify- 
ing the fi-th vertex in the site (k,i). The boundary condition in the temporal direction is 
periodic, i.e., k = M + 1 is identified with k = 1. On the other hand, we are not assuming 
any particular spatial geometry; therefore, the present scheme can be applied to models in 
any dimensions, with any boundary conditions, and with any range of the interactions. 

A horizontal link is a pair of sites whose temporal indices are the same. A vertical link is 
a pair of sites whose spatial indices are the same and temporal indices differ by one. Some 
of the squares which consist of two vertical links and two horizontal links play a special role, 
because a local weight function is defined on them. We call such a square a plaquette and 
often use a symbol p for it. The set of the 8S vertices associated with four corners (sites) 
of a plaquette will also be referred to as a plaquette and represented by the same symbol. 
Which square among all possible squares is to be a plaquette depends on the variant of the 
ST approximation. In any variant of the ST approximation, no vertical link is shared by 
more than one plaquette. The symbol U p denotes the set of plaquettes whereas the symbol 
Ui denotes the set of vertical links which do not belong to any plaquette. In the example 
presented in Subsection II D, Ui is an empty set. In other cases, such as the systems defined 
on a triangular lattice, Ui is non-empty. We also define U =U P UUi. This is the set of local 
units that we will consider in what follows and the products in (2.5), (2.6) and (2.7) should 
be taken over this set in the present case. 

It is convenient to have notations by which we can refer to a specific site or a link 
in a given plaquette p. To this end, we consider a plaquette p consisting of four sites 
{k,i), (k,j), {k + and {k + 1, j). We first take either one of spatial indices i and j and 
call it 'left' and the other 'right'. We use the symbols ii(p) for 'left' index and i r (p) for the 
other. (Which one we call 'left' does not matter in what follows.) We also define symbols 
h(p) and h{p) as the top and bottom horizontal links of p. The symbol s t i(p) stands for a 
top-left site. Other symbols s tr (p), su(p) and Sbr(p) are defined in a similar fashion. The 
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relative locations of these symbols are illustrated in Fig. 3. 




FIG. 3. The hyper-lattice on which the transformed problem is defined 
and a plaquette on which a four-body Boltzmann factor is defined. The upper 
link of the plaquette p belongs to the layer and its lower link belongs to 

the layer Lk. The open ovals represent sites. 



With these definitions and the ST approximation, we can rewrite the partition function 
(3.7) as 

Z=J2 sgn(n)W(n), (3.9) 

with the sign factor sgn(n) and the Boltzmann weight W{n) defined below. The sign factor 
is defined so that W{n) is always non-negative. The configurations with negative sign appear 
only for a frustrated system such as the antiferromagnetic Heisenberg model on a triangular 
lattice. Although the sign factor is physically important, from the computational point of 
view, it is needed only when adding up measured values at each Monte Carlo step and does 
not affect the Markov process. In other words, we usually neglect the sign factor in defining 
the Markov process and the limiting distribution of the resulting process is W{n). Since the 
goal of the present paper is to accelerate the Markov process, we neglect the sign factor in 
(3.9). The weight function W(n) in (3.9) is defined by 

W(n) = J] w{n(jp)) II *i(»(0)- ( 3 - 10 ) 
pelt? leUi 

Since there is no interaction which corresponds to a vertical link in Ui, 6i(n(l)) should be a 
function which takes the non-zero value 1 if and only if ri(k,i,n) = n (k+i,i,n) f° r a ll M where 
l = {{k,i),{k + l,i)}. 
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The function w(n(p)) in (3.10) is the absolute value of matrix element of the local 
Boltzmann operator, that is, 

w(n(p)) = \(n(l t (p))\p(p)\n(l b (p))) 2 \, (3.11) 

Here, (• • -) 2 is the matrix element in 2 4S dimensional Hilbert space on which the operators 
cr™ and o""^ operate, where \i = 1, 2, • • • , 2S and a = x,y, z. The operator p(p) in (3.11) is 
defined by 

Hp) = p l i{p) p l r{p)p{v)P H {p)P l r{p)- (3-12) 

with 

p(p) = e ^(j2 K p^^)) (3-13) 

Here A(/x, u) is defined by 

Aijbi, v) = + <( P ), M < (P ),, + V^.XcpV- ( 3 - 14 ) 

The constant K p in (3.13) depends on the variant of the ST-decomposition. However, in 
any variant of the ST-decomposition, the following equation holds: 

Y, K p =(3J ltJ /4: for all (i, j), (3.15) 
p 

il{p) = i 
*r(p) = j 

where the summation is over the plaquette whose spatial location is specified by two spatial 
indices i and j. 

As a function of K p and A p , the local weight function w(n(p)) has the following property 

[ w ( n (p))]K P ,\ P = [w(n(p))]_ Kpj _ Xp (3.16) 

Because of this identity, we can assume positive K p without loss of generality. With this 
assumption, we can rewrite (3.11) simply as 

w(n(p)) = {n(l t (p))\~p(p)\n(l b (p))) 2 > 0. (3.17) 

In order to prove the identity (3.16), we notice that 

/ a ( \\\~( M (if \\\ {{m t i,m tr \p(p)\m bh m br )), 

(n(l t (p))\p(p)\n(l b (p))) 2 = (3.18) 

Y V m tJ V m W V m bJ V m 

where 

rax = Yj Uv ' X = tl,tr,bl ot br, (3.19) 

vesx(p) 

and |mi,m 2 )) is the eigenstate of Sf, S? and Sj which satisfies 
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Since the term 



S?\m 1 ,m 2 )) = S 2 j \m 1 ,m 2 )) = S(S + l)\m 1 ,m 2 )), (3.20) 
S-\m 1 ,m 2 )) = (-S + mi)|mi,m 2 )), (3.21) 
Sj\m 1 ,m 2 )) = (-S + m 2 )\m 1 ,m 2 )). (3.22) 



S h(p) S 1(p) + S 1{p) S 1{p) ( 3 - 23 ) 

in (3.13) corresponds to transferring a 'particle' (i.e. a vertex with vertex variable 1) from 
one side (left or right) to the opposite, a matrix element 

((m t i,m tr \p(p)\m b i,m br )) (3.24) 

with an odd value of m t i — mj; = —m tr + must be an odd function of J x = J y = J. For 
the same reason, if m t i — is even, the matrix element must be an even function of J. 
Therefore, changing the sign of K and A at the same time, which is equivalent to changing 
the sign of J x = J y with J z fixed, may change the sign of some of matrix elements, not the 
absolute values. Therefore, because of (3.11) and (3.18), (3.16) follows. 

In (3.10), we employed a method of decomposition in which the two-body interaction 
in the original Hamiltonian corresponds to a four-body interaction on the plaquettes. It is 
possible to decompose the partition function in such a way that the local units are cubes, 
for example, instead of plaquettes. This possibility will be briefly discussed in Section VI. 



B. Notation and Definitions 

In Section II, we introduced several notations and definitions to present two pedagogical 
examples. Although most essential concepts have been presented there already, we need to 
extend them to describe the algorithm for more general cases in an unambiguous fashion. 
In this subsection, we will summarize and generalize those notations and definitions. 

A graph G = (V G , E g ) is defined by two sets V G and E G . We call elements of V G vertices 
and elements of Eq edges. In what follows, two symbols v and e, often with subscripts and 
superscripts, denote a vertex and an edge. Every edge has two attributes: two vertices at 
its end-points and color, i.e., green or red. 

A graph G' = (V G ',E G i) is called subgraph of G = (V G ,E G ) if V G > C V G , E G , C E G . A 
path P is a special graph whose vertices and edges can be numbered in such a way that the 
2-th edge's end-points are the 2-th and the {i + l)-th vertices. Two vertices v and v' are 
called connected in G when a path in G exists whose first and last vertices are v and v'. A 
path is called green when it contains an even number of red edges. Otherwise, it is called 
red. If the first and the last vertices of a path are the same, i.e., if the path is closed, it is 
called a loop. A subgraph C of G is called a cluster if any two vertices in C are connected 
in G, any edge in G connecting two vertices in C belongs to C, and no vertices or edges can 
be added to C without violating the two previous conditions. A union of two graphs G and 
G' is defined by 

GUG' = (V G U V GI ,E G UE G ,). (3.25) 
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In Subsection II D, we saw that a special type of edges, namely, green vertical or red 
horizontal ones, played a particularly important role in representing the local weight w(n(p)) 
in the form (2.58). We will see in the next subsection that different types of edges are 
important for different types of anisotropy of the models. Therefore, it is useful to classify 
open paths and edges into several categories. They are classified according to the relative 
locations of their end-points: A vertical path connects vertices whose spatial indices are the 
same but have different temporal indices, a diagonal path connects vertices whose spatial 
and temporal indices are different, a horizontal path connects vertices whose spatial indices 
are different but the temporal indices are the same, and a recurrent path connects vertices 
whose spatial and temporal indices are the same. The vertical paths and the diagonal 
paths are both called temporal, whereas the horizontal paths and the recurrent paths are 
called spatial. Some of these paths are special for ferromagnetic models and some others for 
antiferromagnetic ones. We will call a green vertical or a green diagonal path ferromagnetic, 
and a green vertical or red horizontal path antiferromagnetic. 
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FIG. 4. The types of edges. Red edges are represented by dashed curves 
whereas green edges are represented by solid curves. 



A green vertical path is both ferromagnetic and antiferromagnetic while a recurrent path is 
neither. Since a graph with only one edge and two vertices that the edge connects can be 
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viewed as a special open path, we can apply the above classification to edges as well. These 
definitions are summarized in Fig. 4. 

As we saw already in the last section, we define not only graphs but also a set of vertex 
variables n on the same vertex set as graphs. With these one-bit variables n and graphs G, 
we can express the partition function in the form of (2.1) with (2.3). Those equations can be 
viewed as representing a system that consists of vertex variables and graphs interacting each 
other. From (2.4), it is clear that the function A(n, G) is the 'interaction' between vertex 
variables and graphs. In the two examples presented in the last section, we saw two of its 
special forms. Here, we restate its definition given in Subsection II D for S = 1 Heisenberg 
model in a more clear fashion and show a few of its important properties. The function 
A(n(V G ),G) is defined by 

A(n(V G ),G) = J] ee(n(e)) (3.26) 

eeE a 

with a one-bit function e e (n(e)) which takes value 1 if e is green and the vertex variables 
take the same value at the two end-points, or if e is red and the variables take different 
values. Otherwise, the function takes value 0. If A(«(V'g), G) = 1, the graph G and ti(Vg) 
are called compatible with each other. Examples for compatible and incompatible cases are 
shown in Fig. 5. 



(a) compatible (b) incompatible 




• ■ 


.. n(v) = 1 


O ■ 


.. n(v) = 



FIG. 5. Two local configurations a) compatible and b) incompatible with a 
given graph. The vertices for which the vertex variables are 1 are represented 
by solid circles, the other by open circles. 



It is obvious from the definition that A(tc(Vg),(j) is non-zero if n v = n v i for any pair of 
vertices v and v' connected by a green path and if n v = n v i = 1 — n v i for any pair of vertices 
connected by a red path. For two graphs G and G' ', the following identities hold: 
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A(n(V G ),G)A(n(V G ,),G')= J] e e (n(e)) J[ e e (n(e)) 

eEE a eEE a , 

= II ^(n(e)) J] ^(n(e)) J] (^He))) 2 

eeE a \E a , eeE a ,\E a eeE G nE a , 

= J] £ e(n(e)) = A(n(V GuG ,), G U G"). (3.27) 

e ^ E aua' 

In deriving this identity, we have used the fact that e 2 = e. This identity is useful and will 
be used in what follows. As a corollary, we obtain 

A(n(V G ),G)= J] A(n(Vb),C). (3.28) 
cec(G) 

where C(G) is the set of clusters in G and Vc is the vertex set of a graph C . The proof is 
straightforward using (3.27) if we note that 

U C = G. (3.29) 

cec(G) 



C. Local Weight Equation 

Given the A-functions defined in the last subsection, our task is to find a set of graphs 
T(p) and weight of graphs v(G(p)) that satisfy the local weight equation 

w(n(p))= £ v(G(p))A(n(p),G(p)), (3.30) 
G( P )er(p) 

for various given weights w(n(p)). Once we obtain the form (3.30), by substituting it into 
(3.10), we can express the Boltzmann weight as 

W{n)=]\ £ v(G(u))A(n(u),G(u)), (3.31) 
ueu G(u)er( u ) 

where v(G(l)) for / £ Ui is 1 and A(n(Z), G(l)) is Si(n(l)), i.e., T(l) contains only one graph 
G(l) that consists of 2S green vertical edges each of which connects two vertices with the 
same spatial indices i and vertex indices \i. With the use of (3.27), the above equation can 
be written as 

= E ( II «(<?(«)))( II A(n(«), (?(«))), 

Ger ueu ueu 

= J2V(G)A(n,G), (3.32) 
Ger 

where V(G) = Yl u v(G(u)) and G = U u G(u). Since this representation has the form dis- 
cussed in Subsection II A, we can construct the cluster algorithm following the prescription 
given in Subsection II B. 

In Subsection II D, we defined graphs G(p) whose vertex set is a plaquette with 8 vertices. 
We also defined a special set of graphs in (2.59) over which the summation in (3.30) should 
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be taken. Obviously, for the general XXZ models, we have to define G(p) on a plaquette 
of 8S vertices not on one of only 8 vertices. Furthermore, as we will see in the next section, 
we have to take different types of sets as T(p) in (3.30). In this subsection, we present the 
essential result of the next section, i.e., we present explicitly the set of graphs T(p) with 
which the local weight equation (3.30) can have meaningful solution v(G(p)). 

In the next section, we will find that the proper set of graphs T(p) depends on the 
anisotropy of the problem. Therefore, we will separately treat five distinct cases: 1) XY- 
like (|Aij| < 1), 2) isotropic ferromagnetic (Xij = 1), 3) isotropic antiferromagnetic (Xij = 
— 1), 4) ferromagnetic Ising-like (Xij > 1), and 5) antiferromagnetic Ising-like (Xij < — 1) 
anisotropies. Corresponding to these five cases, we will define five sets of graphs. All graphs 
belonging to any one of these five sets share a property in common. To be more specific, all 
the five sets are included in a larger set of graphs 

T xxz {p) = { G I V G = p, 

"Any cluster in G have even number of vertices," and 

"Any edge is green temporal or red spatial." }. (3.33) 

The first condition reflects the rule that, only local configurations n(p) for which ^2 vep n v 
is even have non-zero weight w(n(p)). In other words, as long as the first condition holds, 
it is guaranteed that we move from an allowed state to another allowed state by flipping 
a cluster. The second restriction imposed upon the edges for the XXZ model reflects 
the fact that z-component of total magnetization commutes with the Hamiltonian and is 
conserved. Flipping those edges, green temporal or red horizontal, automatically results in 
a state for which this conservation law holds. As we will see in the Appendix, we do not 
have this restriction for the XY Z model in which z-component is not conserved while the 
first condition is still valid. 

The five subsets are defined as 

T XY (p) = {Ge T xxz (p) | "No two edges in G share a vertex." }, (3.34) 

T FH (p) = {Ge T XY (p) | "Every edge in G is ferromagnetic." }, (3.35) 

r AFH (p) ee { G E T XY (p) | "Every edge in G is antiferromagnetic." }, (3.36) 

T F (p) = {Ge T xxz (p) | "Every edge in G is ferromagnetic." }, (3.37) 

T AF (p) = {Ge T xxz (p) | "Every edge in G is antiferromagnetic." }. (3.38) 

We note that 

T FH = V XY n V F (3 39) 

T AFH = T XY n j^F (3 40) 

A typical graph for each case is illustrated for the S = 3/2 case in Fig. 6. 
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(b) FH 



(d)F 




FIG. 6. Typical examples for the five classes of graphs for a) Xy-like, 
b) isotropic ferromagnetic, c) isotropic antiferromagnetic, d) ferromagnetic 
Ising-like, and e) antiferromagnetic Ising-like anisotropies. 



In the next section, we will show that with these sets of graphs, at least one non-negative 
set of coefficients v(G(p)) exists that satisfies (3.30) for each type of anisotropy. In this paper, 
we do not give the explicit form of those solutions except for the special case of Heisenberg 
models. Instead, we will present the method to compute the coefficients. 

It is interesting to consider what kind of clusters will be formed when we take the union 
of local graphs, assuming that all plaquettes in the system have the same type of anisotropy. 
One striking difference is seen between the XK-like and the Ising-like anisotropic cases. 
For XK-like anisotropy, in a local graph G{jp), a vertex is an end-point of one and only 
one edge. This means that every vertex is shared by two and only two edges, when we 
take into account the fact that every site is shared by two local units, i.e., plaquettes or 
vertical links. Therefore, when we take the union graph G = \J u G(u), it consists of a 
number of loops without branching. Therefore, the present scheme naturally leads to a 
loop algorithm in the case of XK-like anisotropy. On the other hand, we generally have 
branching in the case of Ising-like anisotropy, because a vertex can be shared by any number 
of edges. Another important difference is that, in the case of XK-like and ferromagnetic 
Ising-like anisotropies, for any worldline in the current configuration, such a loop can form 
that coincides the worldline because the graphs include both vertical and diagonal edges 
and a worldline also consists of vertical and diagonal segments. Here a worldline is a line 
connecting vertices with vertex variable 1, This means that, in those cases, any worldline can 
vanish in a Monte Carlo step. On the other hand, in the case of antiferromagnetic Ising-like 
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anisotropy, a worldline with diagonal segments cannot vanish because a loop which overlaps 
this worldline can not form. This may cause a problem concerning ergodicity as we will 
discuss in Section VI. 

IV. CLUSTER REPRESENTATION OF THE LOCAL BOLTZMANN FACTOR 

In this section, we show that the local Boltzmann factor w(n(p)), defined on a plaquette 
p, can be expressed by a sum of terms, each of which corresponds to a graph. In other 
words, the goal of this section is to prove the weight equation (3.30) has at least one mean- 
ingful solution with a set of graphs (3.34), (3.35), (3.36), (3.37), or (3.38) depending on the 
anisotropy of the problem. We also propose a method to compute the coefficients v(G(p)). 
For the general case, we will show how to compute them numerically, and for the Heisenberg 
models, we will provide compact formulae for the isotropic case. 

Because we will focus on only one pair of interacting points in the original lattice L, 
which we refer to as a and b, the corresponding lattice we consider in this section has 
just two lattice points when projected onto a horizontal plane. What we will discuss is 
the multiplication of two operators defined in the local Hilbert space on these two lattice 
points (in the original lattice). Because an operator in this space can be represented by a 
plaquette, it is sufficient to take two plaquettes, one stacked on the other, in order to discuss 
the multiplication. Therefore, the size of 'hyper-lattice' we consider here in the temporal 
dimension is three. 

To avoid confusion, we emphasize that this hyper-lattice, whose size is two in spatial 
direction and three in temporal direction, is not necessarily the subset of the hyper-lattice 
discussed in the previous sections. The spatial indices a and b, of course, correspond to 
the spatial indices in the last section, because they correspond to the lattice points in the 
original lattice L. The index \i also corresponds to the same index in the last section 
and specifies a vertex in a given site. However, the temporal indices in this section do 
not correspond to those in the last section. Here, they are merely labels introduced to 
distinguish one state vector from another. For example, stands for 45 one-bit variables 
n (k,i,n) {i = a i b; /J- = 1, 2, • • • , 2S). The state {11^)2 is an eigenvector of the z-components of 
Pauli operators, i.e., 

a !j n k)2 = (2n (fciliM ) - l)|TC fc ) 2 . (4.1) 

Defining 4 as a set of two sites with the temporal index k, i.e., {(&, a), {k, &)}, we can rewrite 
rik as n(Zfc). We will use this notation in what follows. 

Another useful tool for the discussion is operators whose matrix elements are given by 
the A-functions defined in Subsection III B. We define an operator A(G(p)) by 

(n(l t (p))\A(G(p))\n(l b (p))) 2 = A(n(p), G(p)). (4.2) 

Here, h(p) and lb(p) are the top and bottom link of the plaquette p as defined in Subsec- 
tion III B. The operator A(G(p)) depends on p only through i r (p) and ii(p) and does not 
depend on the temporal index k. 

To outline the proof before going into its detail is useful. We will introduce an operation 
called the contraction of two A-operators in such a way that the product of two operators Aq p 
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and Aqi p is Ag» multiplied by a scalar factor where G" is the graph resulting from contraction 
of G p and G' . In other words, the contraction of two graphs corresponds to multiplication 
of two operators. This relationship enables us to discuss the nature of the operators in 
a graph-theoretical language. The most important feature that we will prove is that a 
set of operators 0* is closed with respect to multiplication. Here, the symbol * stands for 
l XY\ 'FH', 'FAH', 'F' or 'AF', corresponding to XY-like, isotropic-ferromagnetic, isotropic- 
antiferromagnetic, ferromagnetic-Ising-like, or antiferromagnetic-Ising-like anisotropy, and 
0* is the set of operators o which can be expressed as a sum of elements of T* multiplied by 
some non-negative coefficients v(G(p)). Formally, 



o* 



E v(G( P ))A(G( P )) 
G(p)er*( P ) 



(4.3) 



The sets of graphs T* are the ones defined in Subsection III C . Then, we can show that 
p in (3.12) belongs to one of these sets of operators because 1) the product of projection 
operator in (3.12) belongs to all the above sets of operators, 2) all the operators A(/x, u) in 
(3.13) become elements of one of the above sets of operators when a scaler operator xl is 
added to it (/ is the identity operator and x is some real number), 3) all of the above sets of 
operators are closed with respect to the multiplication of two elements, the multiplication of 
an element by a non-negative real number, and the addition of two elements, 4) the number 
of graphs which correspond to distinct A-operators is finite, and 5) the Taylor expansion 
series of p with respect to K p converges. In short, we will show that a set of non-negative 
coefficients v(G(p)) exists that satisfies 

p= £ v(G(p))A(G(p)), (4.4) 
G( P )er* 

When expressed as an equation between matrix elements, (4.4) reduces to (3.30). 

A. The XY-like anisotropy (|A P | < 1) 

The first step in proving the statement (4.4) in the case where |A P | < 1 is to note the 
following identity: 

A(M, v) = + <A + = -! + (! + A)i(/i, u) + (1 - \)B(n, u). (4.5) 

where A and B are operators defined by 
(n(l 2 )\A( f i,v)\n(l 1 )) 2 = 

^(^(2,a ltt ), ^(1,6,^)^(^(2,6^), ^(l,a ltt )) II ^( n (2,a,a), ^(l,a,a))^(^(2,6,/3), n (l,6,/3)) (4-6) 

(n{h)\B{n,v)\n{h)) 2 = 

^(^(2,a, tt ),^(2,6, V ))^(^(l,a, tt ),^(l,6, V )) II ^( n (2,a,a), ^(l,a,a))^(^(2,6,/3), n (l,6,/3))- (4-7) 

Here, the symbol n denotes 1—n. It is easy to see that the operator A(p,, u) can be expressed 
as 
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where A(/j,, u) is a graph shown in Fig. 7(a). 



(4.8) 



2,a,l) 2,a,2 2,b,l) (2,b,2 




Strip) (b) StKp) <Oi^ <Q)Stt(p) 



Sbl(p) 



l,a,l l,a,2 l,b,l) (l,b,2 



Sbl(p) 



B(2,l)= 



Sbrfp)®' 



5>Sbl(p) 



stifp) (d) sti(p)^^ „^p) Str (p) 
D(2,l)= 



SbrlpKO) 



Sbl(p) 



FIG. 7. Examples for the graphs which define (a) A, (b) B, (c) (7 and (d) D 
in the case of S = 1. The Hamiltonian of a JJZ model can be expressed as a 
sum of these four types of operators. For models with Xy-like anisotropy, only 
graphs of type (a) and (b) are sufficient to express the Hamiltonian whereas 
(b) is replaced by (c) in the case of ferromagnetic Ising-like anisotropy and by 
(d) in the case of antiferromagnetic Ising-like anisotropy. The ovals are sites, 
the solid circles are vertices and the curves are edges. In the graphs for the 
XX Z models, all spatial edges are 'red', and all temporal edges are 'green'. 



Similarly, for the operator Br^), we have 

with the graph B(fi, v) shown in Fig. 7(b). 

Using the identity (4.5), we can rewrite the exponential operator in (3.13) as 



(4.9) 



exp 



^ £ ( - 1 + (1 + A p )i(/z, + A p )£(/z, v 



-K 9 2 

e Api exp 



^ £ ((1 + A P )A(^) + (1 - A p )£(^ 



(4.10) 
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The exponent of the last factor 

jEf^ X r)Mv, ") + (1 " K)B{^ ")) (4-11) 

is an element of XY because K p , 1 + X p and 1 — A p are non-negative and XY is obviously 
closed with respect to the addition of two elements and the multiplication of an element 
by a non-negative number. Additionally, both A(p,,v) and B(p,,v) belong to T XY [p), or 
equivalently, v) and B(fi, v) belong to XY . 

On the other hand, matrix elements of the projection operators in (3.12) can simply be 
written in the following form. 



2S 

(n(l 2 )\P a P b \n(l 1 )) 2 = " II %(2,a,,r(,i)),ra(l,a,,i))%(2,6, 1 rV))> rl (l.M)- ( 4 " 12 ) 



where ir and 7r' stand for permutations of the set {1, 2, • • • , 2S} and the summation is taken 
over all possible permutations. Therefore, this product of the projection operators can be 
viewed as a sum of A-operators for which the graph consists of 2S green vertical edges 
which connect (l,a,/x) to (2,a,7r(/x)) and (1,6, /x) to (2, b, 7r'(/x)). Therefore, the operator 
P a Pb belongs to all five sets of operators defined in (4.3). To summarize, we have shown 
that p in (3.17) can be expressed in the form 

p = Ye x Y (4.13) 

with two elements X and Y of XY . 

We now consider an arbitrary set T(p) of graphs defined on a plaquette p and its two 
arbitrary elements, G\ and G 2 - If there exists an element G in T(p) for which 

A(G' 1 )A(G' 2 ) = 2 m A(G) (4.14) 

holds with some non-negative integer m, we call T(p) closed with respect to multiplication. 
With this definition, we will prove the following statement: 

Lemma 1 Y XY {jp) is closed with respect to multiplication. 

If we assume Lemma 1, it is obvious that XY defined in (4.3) is closed with respect to 
multiplication. 

We first assume that the above Lemma is true and show that p expressed in the form 
(4.13) belongs to XY . To this end, we consider the Taylor expansion of e x with respect 
to X. We then neglect the terms of the {n + l)-th order and higher. The k-th. order term 
in the Taylor expansion is (l/k\)X k and this obviously belongs to XY . So does the k-th 
order term of p, i.e., {l/k\)YX k Y. Thus the ra-th order approximant of p also belongs to 
XY . In other words, the operator p in (3.12) can be approximated up to the ra-th order by 

~(n) G QXY Namelyj 

pnpW= £ v^\G(p))A(G(p)), (4.15) 
G( P )erxY {p) 
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with non-negative coefficients v^(G(p)). Here we note that the number of distinct elements 
of T XY (p) is finite. We also note that the series v^ n \G(p)) (n = 1, 2, 3, • • •) is monotonically 
convergent to a non-negative value because the contribution from each order term to v^ n \G) 
is non-negative. These facts and (4.15) make it obvious that in the limit of n — > oo the 
operator p^ can still be expressed in the form (4.15), i.e., it belongs to XY . Thus we get 
the following theorem: 

Theorem 1 If \X P \ is not greater than unity, the operator p can be decomposed into a sum 
of A-operators of the form 

p= £ v(G(p))A(G(p)), (4.16) 
G( P )er^( P ) 

with non-negative coefficients v(G(p)). 

Taking the matrix elements of the both sides of the above equation, we have 

w(n(p))= £ v(G(p))A(n(p),G(p)). (4.17) 

This is the statement that we wanted to show for XK-like anisotropic models. 

B. Contraction operation and proof of Lemma 1 

Now we prove Lemma 1. We consider two arbitrary elements G\ and G2 of T XY . Since 
A(Gi) and A(G 2 ) do not depend on a particular plaquette, as we discussed above, we assume 
that G\ is defined on p\ = l\ U l 2 and G2 on p 2 = l 2 U I3 (see Fig. 8). 

Using (3.27), we can express the matrix element of A(G2)A(Gi) as 

(n(/ 3 )|A(G 2 )A(G 1 )|n(/ 1 )) 2 = £ (n(/ 3 )|A(G 2 )|n(/ 2 )) 2 (n(/ 2 )|A(G 1 )|n(/ 1 )) 2 

n{i 2 ) 

= £ A(n( P2 ),G 2 )A(n( Pl ),G 1 ) 
n(i 2 ) 

= A Hk U l 2 U h), d U G 2 ). (4.18) 
n(i 2 ) 

Any cluster in the union G\ U G 2 is a single path because a vertex in li can only be an 
end-point of one and only one of edges in G\, a vertex in l 3 can only be in G 2 , and a vertex 
in l 2 is shared by an edge in G\ and another in G 2 . Therefore, the clusters (in this case, 
paths) have the following properties: 

Property 1 The vertices on the top link and on the bottom link (l\), which correspond 
to (3,2,/x) and (1,2,/z), are end-points of paths. They cannot be intermediate points. 
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FIG. 8. Two graphs d G T XY (p 1 ) and G 2 G T XY (p 2 ), and their contrac- 
tion resulting in a graph G with a factor two. The factor comes from the small 
loop in the right middle part of the graph on the left hand side. 

Property 2 The vertices on the middle link (l 2 ), which corresponds to (2,2,/z), are inter- 
mediate points of paths. They cannot be end-points. 

Property 3 If an edge belongs to the lower graph (G\), the adjacent edge(s) in the same 
path belongs to the upper graph (G 2 ) s and vice versa. 

These elemental properties lead to the following properties: 

Property 1' A temporal edge can appear only as the first or the last edge in a path. 

Property 2' A spatial edge can appear only as the intermediate edge in a path. 

From Property 3, the following properties are derived. 
Property 4 The length of any loop is even. 

Here, the length of a path is the number of its edges. Similar to Property 4, for an open 
spatial path we have the following property: 

Property 5 The length of any open spatial path is odd. 

This is because the first and the last edges in such an open path belong to the same graph 
(G\ or G 2 ). For the same reason, we have 
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Property 6 The length of any temporal path is even. 

Using (3.28), the matrix elements (4.18) can be expressed in terms of contributions from 
these paths as 

(n(Z 3 )|A(G a )A(G 1 )|n(Z 1 )) a = £ ]\A{n{Vp),P) = ]\f{P) (4.19) 

n{i 2 ) p p 

where P stands for a path, Vp is the vertex set of P, and f(P) is the contribution from the 
path P defined by 

/(P)= £ A(n(V P ),P). (4.20) 
n(v P m 2 ) 

Note that the paths P in (4.19) can be classified into three categories: 1) loops, 2) open 
spatial paths, and 3) open temporal paths. In what follows, we discuss contributions from 
these three types of paths separately. 

We first consider the contribution in (4.19) from loops, i.e., f(P) for a loop P. Because 
there are no end-points, all vertices are on the middle link l 2 (Property 1). Therefore, all 
edges are red because of the definition of Y xxz {jp). Hence, we can write the contribution 
from a loop P whose length is 2m (Property 4) in the following way: 

f(P) = E E • • • E %f . < ) • • • <J*« m) nf ), (4.21) 

p p p 

1 2 2m 

where nf is the value of the one-bit function n on the 2-th vertex in the path P. The product 
of the 6 functions is non-zero if and only if 

<=< = •••= < m . (4.22) 

Therefore, a contribution from a closed path is a mere numerical factor of 2, i.e., f(P) = 2. 

Next we consider the contribution from a spatial path. If the length of the path is one, 
the edge must be spatial and there is no intermediate vertex. Therefore, the contribution is 
6(nf,nf) and we will assume the length is larger than one. For such a path, the first and 
the last edges are temporal and all others are spatial, because of Properties 1 and 2. The 
contribution can then be written as 

f( P ) = EE'" E K U l > U 2)K n 2 ,^3)K^3 > n 4) ■ ■ ■ K n L-2^L-l)K^L-l,^L) 
P P p 

n 2 n 3 n 2m-l 

= *(«J- (4-23) 

Therefore, by tracing out the intermediate variables, this path is transformed into a red 
spatial edge in the resulting graph, whether the length is one or larger. 

Finally we consider the contribution from a temporal path. In this case, the first and 
the last edges are temporal as the previous case. All other intermediate edges are spatial. 
Since the length is even (Property 6) in this case, the contribution is 

f(P) = E E • • • E %f ,<)*(<,<)£(<,< ) • • • *(nL-i,nL)%L,nL + i) 

p p p 

2 o 2m 

= 6(n(,nf m+1 ). (4.24) 
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Hence the contraction results in a green vertical or green diagonal edge, i.e., a green temporal 
edge. In Fig. 8, we can see examples of the three types of contribution we have discussed. 

Thus we have shown that the right hand side of (4.19) is a product of three kinds 
of factors: 1) the numerical factor 2 m where m is the number of loops, 2) the factor which 
corresponds to antiferromagnetic edges, and 3) the factor which corresponds to ferromagnetic 
edges. Hence Lemma 1 is proven. 

In this subsection, we traced out the intermediate variables to obtain a new A-function 
with a multiplicative factor. As illustrated in Fig. 8, it can be done graphically by 1) erasing 
intermediate vertices, 2) replacing open paths by edges with the same color, and 3) replacing 
each loop by a factor 2. In what follows, we call this graphical operation a contraction. 

C. A numerical method to compute the coefficients v(G p ) 

The discussion based on the Taylor expansion that led (4.15) provides us not only a proof 
of Theorem 1 but also a numerical method for computing v(G) = lim^oo v^ n \G). What we 
have to do is 

1. Make a table of the multiplication rules among the elements of Y XY (p), i.e., a table 
which tells us which graph and what scaler multiplicative factor result from contraction 
of two arbitrary graphs in Y XY {jp). 

2. Compute the Taylor expansion of the exponential operator in (4.10) up to the ra-th 
order using the table. 

3. Compute the result of multiplication of the projection operators. Or, equivalently, 
symmetrize the expression obtained in the last procedure with respect to the vertex 
indices. 

Considering the fact that expansion series of an exponential function of bounded matrices 
such as (4.10) generally converges quickly, v( n \G(p)) in (4.15) should be good approximants 
for v(G(p)) with not too large n. We remark that the above procedure does not become 
combinatorially difficult as n increases. The amount of computation is only proportional to 
n. Therefore, this procedure to compute the coefficient is reasonably practical, although it 
may not be optimal. We also remark that the above procedure applies to other types of 
anisotropy to be now discussed. 

D. The isotropic cases (|A P | = 1) 

We now consider the special cases where X p = 1 or X p = — 1. These cases have already 
been considered in the last subsection. However, we take up these cases separately because 
the set of graphs we must consider for the expression (4.16) is truly smaller than T XY (p). 
This reduction means that the set of graphs needed in the labeling process of the simulation 
can be simpler than in the general case. It will also be clear in the next section that in 
these special cases the coefficient can even be computed analytically without resorting to 
the numerical method discussed above. For the isotropic cases, the sets of graphs T FH (p) and 
T AFH (p) play the same role as Y XY {jp) did for the XY"-like anisotropy. In what follows, we 
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call a graph which belongs to T FH (p) a special ferromagnetic graph, and one which belongs 
to r AFH (p), a special antiferromagnetic graph. 

We can show for the special ferromagnetic graphs that 

Lemma 2 T FH (p) is closed with respect to multiplication. 

It is sufficient to prove that the following property exists in the union graph defined on 
Pi U p 2 : 

Property 7 Given a graph defined on two plaquettes sharing a link, if all edges in a path 
are ferromagnetic, the path is also ferromagnetic. 

As we did in the last subsection, we consider two plaquettes stacked on top of each other. 
In the present case, we consider a special ferromagnetic graph defined on each plaquette. 
To show Property 7, it is sufficient to prove that we can obtain only temporal paths from 
ferromagnetic edges and that those paths are green. Because of Property 1', any possible 
path in the stacked special ferromagnetic graphs consists of two temporal edges: one in 
the lower graph and the other in the upper graph. Therefore, the resulting path must be 
temporal. Because these two edges are both green, the resulting path is also green. Thus, 
Lemma 2 is proven. As a corollary, we can easily show that (9 FH is closed with respect to 
multiplication. 

A statement similar to Lemma 2 holds for the antiferromagnetic graphs: 
Lemma 3 T AFH (p) is closed with respect to multiplication. 
This is equivalent to the following statement. 

Property 8 Given a graph defined on two plaquettes sharing a link, if all edges in a path 
are antiferromagnetic, the path is also antiferromagnetic. 

It is sufficient to prove that there are no diagonal or recurrent paths in the stacked special 
antiferromagnetic graphs and that all horizontal paths are red while all vertical paths are 
green. First, we assume that a path P is diagonal. Because all edges are either horizontal 
or vertical, we must then have an odd number of horizontal edges and two vertical edges 
in P. Therefore, the length of P must be odd. However, this contradicts Property 6 so 
there are no diagonal paths. Next, we assume that a path P is recurrent. With the same 
reason as above, the length of P must be even. However, this contradicts Property 5 so no 
recurrent edge can appear. As for the colors of the resulting path, horizontal ones must be 
red while vertical ones must be green because of Lemma 1 and the fact that T AFH C T* Y . 
Thus, Lemma 3 is proven. It follows that (9 AFH is closed with respect to multiplication. 

Because the graphs for the projection operators (4.12) belong to T FH (p), by using 
Lemma 2 and following the same line of the argument that proved Theorem 1, we can 
expand the operator p in terms of ferromagnetic A-operators: 

Theorem 2 The ferromagnetic operator p can be decomposed into a sum of A-operators of 
the form 

p= £ v(G(p))A(G(p)), (4.25) 
G(p)er™(p) 

with non-negative coefficients v(G(p)). 
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By the same argument, using Lemma 3, we have the following theorem for the antifer- 
romagnetic operator p: 

Theorem 3 The antiferromagnetic operator p can be decomposed into a sum of A- operator 
of the form 

p= £ v(G(p))A(G( P )), (4.26) 
G(p)er APH ( P ) 

with non-negative coefficients v(G(p)). 

E. The ferromagnetic Ising-like anisotropy (A > 1) 

In the case of Ising-like anisotropy, we cannot use expression (4.5) as a starting point, 
because the coefficient (1 — A) may be negative. Therefore, instead of (4.5), we consider the 
following identity with positive coefficients as the new starting point for the ferromagnetic 
case (A > 1): 

<,A + <X,„ + A <X,„ = -A + 2i(/z, v) + 2(A - 1)CV, v\ (4.27) 
where the new operator C(/x, u) is defined by 
(n{l 2 )\C{vL,u)\n{h)) 2 = 

£(™(2,a, M ), ^(l,a, M ))^(^(2,fc,^), ™(l,fe^))£(™(2,a, M ), ™(l,fe^))£(™(2,fe,^), ™(l,a,^)) 

X JJ ^(2,a,a),?T-(l,a,a)) II < K n (2,fe,/3)> ™(l,b,/3))- (4.28) 

The graph C(p,,v) which expresses this operator is shown in Fig. 7(c). With this graph, 
we can simply write C(p,,v) = A(C(/x, u)). Obviously, the new graph does not correspond 
to any graph in T XY (p) because some of vertices are shared by multiple edges. However, 
it still belongs to Y xxz {jp). In fact, A(p,,v) and C(p,,v) both belong to T F (p) defined in 
(3.37). In what follows, we call such a graph ferromagnetic. We also call the corresponding 
A-operator ferromagnetic. The difference from is that for there is no restriction on 
the number of edges sharing a vertex. 

For the ferromagnetic graphs, we can show the following statement: 

Lemma 4 T F (p) is closed with respect to multiplication. 

We will consider the contraction of these operators. We first take two ferromagnetic graphs 
stacked on top of one another. This time a cluster is not necessarily a path. In addition, 
not all the paths satisfy the Property 3. We will call paths that satisfy the Properties 1, 2 
and 3 alternating paths. For alternating path all the properties (Properties 1-8) holds since 
Properties 4-8 are derived from Properties 1-3. In particular, Property 7 holds also for these 
alternating paths. It is important to notice that any ferromagnetic graph is equal to the 
union of all the alternating paths in it. Since the alternating paths overlap each other in 
the present case, additional consideration is needed to find the contribution from the paths 
when they are contracted. The result is, however, the same: a ferromagnetic alternating 
path contracts to a ferromagnetic edge. 

To see this, we resort to an example rather than rigorous argument. The example is 
shown in Fig. 9. 
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FIG. 9. An example of contraction of two ferromagnetic graphs. 



In Fig. 9(a), a typical graph is shown in the case of S = 1. There are three clusters in the 
graph. The result of the contraction is merely the product of corresponding three factors. 
Since the left and the right clusters' contributions are trivial, we focus on the middle cluster 
whose vertex set is {f , Vi, v 2 , f 3 , t> 4 }. There are four alternating paths in this cluster. Their 
edge sets are {ei, e 3 }, {ei, e 4 }, {e 2 , e 3 } and {e 2 , e 4 }. Correspondingly, the contraction of this 
cluster results in 

^2 S(n , ni)5(n , n 2 )6(n , n 3 )6(n , n 4 ) 

no 

= X)[^( ni » n o)6(n , n 3 )] [S(ni,n )S(n 0) n 4 )] [S(n 2 ,n )S(n , n 3 )] [6(n 2 ,n )6(n ,n 4 )] 

no 

= £(ni,n 3 )£(ni,n 4 )£(n2,n 3 )£(n2,n 4 ), (4.29) 

where rii = n Vi . The last expression in the above equation is represented by Fig. 9(b). It is 
easy to extract essential points from this example and construct a rigorous proof. Thus we 
obtain Lemma 4. 

Combining Lemma 4 and the expression (4.27), we obtain the following theorem: 
Theorem 4 For A greater than or equal to 1, we can expand the operator p in the form 

~ P = £ v(G(p))A(G( P )), (4.30) 
G( P )erP(p) 

with non-negative coefficients v(G(p)). 
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We can compute the coefficients v(G(p)) numerically in this case, too, by the procedure 
described in Subsection IV C. The first step may seem impossible to do in the present case 
where we have infinitely many distinct graphs. However, even if the number of graphs is 
infinite, it is still possible if the number of graphs which correspond to distinct A-operators 
is finite. In such a case, by identifying all the distinct graphs which give the same operator, 
we can keep the first step of the procedure manageable. In fact, the number of distinct A- 
operators is finite in the present case. Therefore, the procedure discussed in Subsection IV C 
can apply. 

F. The antiferromagnetic Ising-like anisotropy (A < —1) 

In this case, we start with the following identity. 

+ + X <A = x + 2 ^ ") + 2 (" A - ")> ( 4 - 31 ) 

where the new operator D(/j,, u) is defined by 
{n{l 2 )\b{^v)\n{h)) 2 = 

5(n(2,a,n), rc(l,a,n))£(rc(2,b,i>), ^(l,b,v))^(^(l,a,n), n{\,b,v))t>{n{2,b,v), ™(2,a, M )) 

X JJ £(™( 2 ,a,a),™(l,a,a)) II < K n (2,fe,/3)> ™(l,b,/3))- (4.32) 

The graph which expresses this operator is shown in Fig. 7(d). 

In the last subsection, we defined T F (p) by removing the restriction on the multiple 
occupation of vertices from T FH (p). Similarly, in the present case, we consider a graph 
which consists of an arbitrary number of antiferromagnetic edges. Every vertex must be an 
end-point of one or more edges. We call this type of graph antiferromagnetic and represent 
the set of the antiferromagnetic graphs by T AF (p). For the antiferromagnetic operators, we 
have the following lemma: 

Lemma 5 T^ F is closed with respect to multiplication. 

The proof can be done in the same way as the one in the last subsection. In other words, 
using Property 8, we can show that all the alternating paths in the graph G\ U G 2 are 
antiferromagnetic. The contribution from each alternating path can be represented by an 
antiferromagnetic edge, unless the path is a loop. If the path is a loop which is connected 
to an open path, it contributes a factor 1. If the path is a loop which is not connected 
to any open path, it contributes a factor 2 together with all other loops connected to it. 
Therefore, contraction results in antiferromagnetic edges and numerical factors. Thus follows 
the lemma. 

Combining Lemma 5 and the expression (4.31), we obtain the following theorem: 
Theorem 5 For A smaller than or equal to —1, we can expand the operator p in the form 

~ P = £ v{G{p))k{G{p)\ (4.33) 
G(p)er**(p) 

with non-negative coefficients v(G(p)). 
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V. HEISENBERG MODELS 



In the isotropic cases, i.e., |A P | = 1, we can obtain compact formulae for the coeffi- 
cients v(G(p)). To obtain this formula, we have to examine the nature of the coefficients in 
more detail. As we did in Subsection II D, we can classify local configurations into several 
categories by using symmetry properties. The function w(n(p)) has various symmetries: 
invariance under 1) the vertical and horizontal mirror-image transformation, 2) the simul- 
taneous flipping of all spins, and 3) the permutation of vertices within each site. We define 
classes of configurations n(p) in such a way that two configurations n(p) and n'(p) belong 
to the same class if and only if they can be transformed into each other by these symmetry 
transformations. We use symbol S{jp) to specify such a class, which is a special set of states. 
For the same reason, we expect that the solution v(G(p)) possesses the same symmetry, i.e., 
v(G'(p)) = v(G(p)) if a graph G'(p) can be obtained from G(p) through these symmetry 
transformations. Therefore, it is convenient to define classes of graphs. We say that G'(p) 
belongs to the same class as G(p) if and only if G(p) can be transformed into G'(p) through 
the symmetry transformations. A class is a subset of Y xxz {jp) and denoted by the symbol 
g(p). Note that 

U g( P ) = T*(p), (5.i) 

C( P )cr* 

where * stands for l XY\ 'FH', 'AFH','F' or 'AF'. Taking these definitions into account, we 
can reduce (3.30) to 

™{s{p))= £ N(s(p),g(p))i(g( P )), (5.2) 

ecr*( P ) 

where 



w(S(p)) = w(n(p)), n(p) e S(p), (5.3) 
v{Q{jp)) = v{G{p)l G{p) e g( P ), (5.4) 

and 

N(S(p),g(p))= £ A(n(p),G(p)), n(p)eS(p). (5.5) 

G(p)eO( P ) 

We first note that the class of graphs g(p) is characterized by 16 integers, mx 1 -x 2 
(Xi,X 2 = bl, br, tl, and tr). The integer mx 1 -x 2 1S the number of edges which connect 
vertices in the site .^(p) to vertices in the site sx 2 (p)- For example, rau-tr is the number of 
vertical edges which connects a vertex in the bottom-left site and a vertex in the top-right 
site. In the case of ferromagnetic Heisenberg model, these integers vanish except for mu-ti, 
rau-tr-, mbr-ti an d mbr-tr- Since there are constraints 

mu-ti + mu-tr = mbr-ti + mbr-tr = mu-ti + mbr-ti = mu-tr + mbr-tr = 25, (5.6) 

only one of the above four numbers is independent. Therefore, we take ra^-ti as the repre- 
sentative. Thus, we can simply express the coefficient v(g(p)) as 



38 



v(G(p)) = v(m b i-ti)- 



(5.7) 



Next, we notice that a class of configurations is characterized by four integers, m{,/ , , m t / 
and m tr , where mj; is the sum of vertex variables at the bottom-left site, and the other three 
are similarly defined. This time, three of them are independent. We express the local weight 
function as in Subsection II D: 



w(S(p)) = w 



m t i m tr 
m b i m br 



(5.8) 



Using this notation, we can rewrite (4.25) as 



w 



m t i m tr 
m b i m br 



"lbl-tl V 



m t i m tr 
m b i m br 



"T-bi-ti v{m h i-ti 



(5.9) 



We now consider a configuration which belongs to a class characterized by 
(m t i, m tr , m b i, m br ) = (m, 2S — m, 0, 2S). If a graph G is compatible with this configuration, 
all m occupied vertices in the top-left site must be connected to vertices in the bottom-right 
site because all edges in the graph are temporal and green. For a similar reason, all 2S — m 
occupied vertices in the top-right site must be connected to those in the bottom-right site. 
Therefore, we can easily see that the function N defined above must have the following 
property: 



FH 



m 2S — m 
2S 



m b i-ti 



( ((2S)!) 2 , 



if m b i-ti = 
otherwise. 



m, 



(5.10) 



Having this equation, we can derive from (5.9) the following equation which determines i>fh 
completely. 



w 



m 2S — m 
2S 



((25)! 



) 2 v(m) 



(5.11; 



Using the identity (3.18), we finally get 



v[m) 



((25)!) 



'2S^ 



m 



((m, 2S - m\ exp(-Ar^)|0, 2S)). 



(5.12) 



ArTiij), a matrix 



Therefore, the problem has been reduced to the diagonalization of exp 
in (2S + 1) dimensions, which is numerically simple and also can be done rigorously by 
various symbolic computational languages. 

Now, we will discuss the isotropic antiferromagnetic case. In this case, too, the class of 
graphs Q{p) is characterized by four integers, m b i-ti, mfc-tr, mbi-br and m t /_ tr .. As in the case 
of ferromagnetic Heisenberg model, only one of the above four numbers are independent. 
We express the coefficient v(Q(p)) as 



v{G{p)) = v(m U -ti) 



(5.13) 



Equation (4.26) is rewritten as 
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w 



m tl m tr 
m b i rribr 



E * 



AFH 



™-bl-tl 



m tl m tr 
m b i m br 



m b i-ti v{m b i_ t i 



(5.14) 



We next consider a configuration which belong to a class characterized by (m t i, m tr , 
T^bij mbi) = (0,m,m, 0). In this case, in any graph compatible to this configuration, all the 
m occupied vertices in the bottom-left site must be connected to vertices in the bottom-right 
site because we can not connect them by green edges to the top-left site where there are no 
occupied vertices. Hence, we obtain the following property of N^fh- 



N 



AFH 




Accordingly, we obtain 



mu-ti 



1 /2S> 



| ((2^)!) 2 



if m b i- t i = 2S 
otherwise. 



m, 



(5.15) 



v[m) 



((25)!) 



m 



((0,m|exp(-ATW iiJ -)|m,0)). 



(5.16) 



This is the generalization of (2.62). 



VI. ERGODICITY 

Here we call an algorithm ergodic if an arbitrary state can be reached with a non-zero 
probability within a finite number of Monte Carlo steps regardless of the initial state. The 
ergodicity and the detailed balance condition are two important features that any algorithm 
must possess. In the main text of this paper, we focused on the detailed balance condition 
and left out the discussion of ergodicity. It is sometimes non-trivial to show that an algorithm 
possesses this property. For example, the ergodicity of conventional algorithms has not yet 
been proved as far as we know. However, for the new algorithm presented in this paper, it 
is almost straightforward to prove that ergodicity holds in the case of the ferromagnetic and 
Xy-like models, as we will see below. In the antiferromagnetic models, we can show that 
the new algorithm is ergodic if the conventional algorithm is. 

The main part of conventional worldline algorithms is local loop flips where size and 
shape of the loops are fixed. In order to achieve the ergodicity, the shape and the location 
must be chosen carefully. Even if we choose them properly, however, using only the local 
movements does not constitute an ergodic algorithm because some quantities are conserved 
by those movements in the conventional algorithm [20]. Global winding numbers are such 
conserved quantities. Therefore, we have to include several different kinds of global updates 
that can change the winding numbers. What local updates and global updates exactly we 
should include depends on the model. We always have to face this annoying question as 
long as we use the conventional algorithms. In fact, this difficulty also makes the actual 
computer programs complicated in order to accommodate different kinds of procedures. 

Now we prove the ergodicity of the new algorithm in the ferromagnetic and XK-like 
cases. To this end, we consider an arbitrary worldline configuration. If we note that any 
worldline consists of vertical line segments and diagonal ones, it is easy to realize that in a 
labeling process a loop configuration can be generated with a finite probability in such a way 
that any worldline in the initial state coincides one of the loops in the loop configuration. 
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(Of course this is possible only in the case of ferromagnetic or XK-like systems where graphs 
with diagonal edges can be chosen.) Once such loops are formed, in the subsequent flipping 
process, it can happen that all existing worldlines vanish (flip) and no new worldlines are 
created. The outcome is the vacuum state. Thus, within one Monte Carlo step, the vacuum 
state can be reached from an arbitrary state. The inverse process can also take place with a 
finite probability, i.e., the transition from the vacuum state to an arbitrary state. Therefore, 
we can conclude that every state can be reached from any state within two Monte Carlo 
steps via the vacuum state. Thus, the ergodicity is proven. 

This proof was possible because vertical and diagonal segments appear in the graphs 
corresponding to the operators / and A in the decomposition (4.5) and (4.27). Since the 
decomposition (4.31) does not contain the operator A, the ergodicity of the antiferromagnetic 
model is not obvious. However, we can argue that if conventional algorithms are ergodic, 
the new algorithm is also ergodic. We can see this simpy by noticing that most global 
flips introduced in the conventional algorithms can happen in the new algorithm with finite 
probability. To be more specific, the n-direction global flips in [20] are simply flipping of 
loops with the temporal winding number 1, and the x-direction global flips are those of loops 
with spatial winding number 1. Both types of loops can form in the new algorithm. The 
t-direction global flips are equivalent to flipping of loops whose temporal and spatial winding 
numbers are 1, as far as the effect on the global winding numbers is concerned. Again, this 
type of loops can form in the new algorithm. 

Here, we emphasize that these global movements are intrinsically included in the new 
algorithm and they do not introduce any additional complication into the algorithm. In other 
words, we can achieve the ergodicity in a simple and systematic way in the new algorithm 
in contrast to the conventional algorithms. 

VII. SUMMARY 

In this paper, we described how the Boltzmann factor of the lattice quantum spin model 
can be decomposed into sum of terms each of which corresponds to a graph. Based on this 
decomposition, we have shown that a non-trivial cluster algorithm exists for any quantum 
spin model which can be described by this Hamiltonian, regardless of geometrical properties 
of the original lattice such as the number of dimensions, the boundary condition, and of 
the range of the interactions. This decomposition also determines how to choose the proper 
set of local graphs and how to assign one of them to each plaquette. The new algorithm 
is advantageous for several reasons: 1) it may reduce the autocorrelation time drastically, 
2) with the improved estimators [19], it can reduce variances of distribution functions of 
important physical quantities and therefore reduce the statistical error, 3) it can achieve 
ergodicity without introducing any ad hoc global updates, and 4) the resulting computer 
programs can be simpler than those for the conventional algorithms. For some cases, such as 
the Ising model and the S = 1 antiferromagnetic, we explicitly gave the labeling probabilities 
that determine the Monte Carlo algorithm. For the general case, a method for computing 
the labeling probabilities numerically was presented. 

Our representation can be viewed as the extension of the FK cluster representation of the 
Ising model. Therefore, the resulting Monte Carlo algorithms are generalizations of the SW 
algorithm to the quantum spin problems. As we showed in [7], the present scheme converges 



41 



to SW algorithm in the limit of strong Ising-like anisotropy. The present algorithm also 
includes the loop algorithm for the S = 1/2 model [15]. 

It is remarkable that in contrast to the conventional algorithm the cluster algorithm must 
be quite different depending on the anisotropy of the model. For the XK-like anisotropy, 
we can have a loop algorithm. A loop algorithm is advantageous because we can identify 
loops in a computational time proportional to the number of vertices N v = 2SM\L\. For 
the Ising-like anisotropy, we inevitably create clusters with branching. In this case, the 
computational time is proportional to N v log(N v ). 

Although the efficiency of the algorithms has not yet been demonstrated systematically, 
some encouraging facts are already known. The efficiency of the SW algorithm near a critical 
points is well-known. Other examples are the efficiency of the algorithms for the S = 1/2 
antiferromagnetic Heisenberg model [16] in two dimensions and S = 1 antiferromagnetic 
Heisenberg model in one dimension [7]. In the latter, the autocorrelation time was at least 
three orders of magnitude smaller than that of the conventional method. The difference 
tends to be greater for large Trotter numbers. We also found that the present algorithm 
dramatically reduces the autocorrelation time of S = 1/2 XY model in two dimensions, 
which is equivalent to hard-core boson model. This result will be published elsewhere [21]. 
Based on these findings, we believe that for the homogeneous XXZ spin model without a 
symmetry breaking field such as a magnetic field, the present algorithm provides a more 
efficient alternative to the standard local Metropolis algorithm. On the other hand, we 
know much less about the efficiency of the algorithm for disordered system and systems 
with symmetry breaking fields. As for the disorder, it is known [3] that the SW algorithm 
is not advantageous for random-bond Ising model, i.e., the Edwards- Anderson spin glass 
model. As for the effect of the symmetry breaking field, we empirically found in the case 
of the S = 1/2 models in two dimensions that the algorithm works well for the Ising model 
regardless of the strength of a magnetic field whereas it does not for the XY model when 
the field in the z-direction is strong. More work is needed for these cases. 
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APPENDIX: THE COMPLETELY ANISOTROPIC CASE — THE XYZ MODEL 

Since the case where J x = J y is much more frequently studied than completely asym- 
metric case where J x ^ J y ^ J z ^ J x , we focused on the XXZ model in the main text 
to avoid too much complication. However, the XY Z model can be treated in a fashion 
similar to that presented for the XXZ model. In this appendix, we give a brief outline of 
the algorithm for the XYZ model. A more detailed discussion will be given elsewhere [22]. 

We first note that 
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\K 4 





K 2 
K 3 






K 3 
K 2 




# 4 \ 





^K u K 2 ,K s ,K t (V,v) (Al) 



where 



K X = K Q ^K Z , K 2 = K -K Z 



K x + i*T y , and if 4 = K x - K. 



v 



(A2) 



Here the constant K is irrelevant, i.e., its value does not affect the result at all. We 
introduced it for appearance. We also define wk 1 ,k 2 ,k 3 ,k 4 as the absolute value of the matrix 
element of p which is defined by (3.12) and (3.13) with A(/x, u) replaced by A.K 1 ,K 2 ,K 3 ,K i {li', f )• 
We then can show, by almost the same argument as we presented to show (3.16), that 

WK 1 ,K 2 ,K 3 ,K i = WK 1 ,K 2 -K 3 ,K i = WK 1 ,K 2 ,K 3 -K i = WK 1 ,K 2 -K 3 -K i - (A3) 

This generalizes (3.16). Because of (A3) we can assume non-negative values for K 3 and 
without loss of generality. We can also assume non-negative Ki and K 2 because we can 
take as large a value as we want for the irrelevant constant K without changing final result. 
Therefore, we assume that all the constants are non-negative. 

For the XY Z model, we consider ten types of graphs (Fig. 10) instead of the four types 
in Fig. 7 for the XX Z model. 




FIG. 10. Ten types of graphs for the decomposition of local Boltzmann 
factor of the XY Z model. For clarity, two green vertical edges which connect 
leftmost vertices and rightmost vertices in each graph are not drawn. 
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We denote these graphs as G x {^,v) where X = (1), (2), (3), (4), (12), (13), (14), (23), (24), 
and (34) as indicated in Fig. 10. Here, we omit the index p specifying a plaquette. We 
note that the graphs for the four types of operators for the XX Z model discussed can be 
rewritten in this notation as 

A(^) = G^W), 
S(^) = G( 2 ' 3 W), 

The decomposition of the Hamiltonian which is analogous to (4.5), (4.27) and (4.31) is 

AK u K 2 ,K 3 , Ki (v, v) = E «xA(G x (/x, u)), (A4) 

X 

where the undetermined variables ax must be non-negative. There is at least one non-trivial 
solution to this equation. In fact there are many in general. Which solution gives the most 
effective algorithm has not been studied extensively. The more detailed discussion about 
the solutions of (A4) will be presented elsewhere. 

We should point out here that the matrix element of (Al) can be viewed as the vertex 
weight of an eight-vertex model with symmetry with respect to simultaneous inversion of 
all arrows. Therefore, the solution of the equation (A4) gives us a cluster algorithm of an 
eight-vertex model as well as the foundation for the cluster algorithm of the XYZ model. 
A similar remark applies to the XX Z model where 5 = 1/2 problem was special case of the 
six- vertex model. 

Here we will only give an relatively simple but useful example of the solution of (A4) 
instead of listing all possible solutions. We consider the homogeneous system where the cou- 
pling constant does not depend on the location. We also assume, without loss of generality, 
that 

\K z \>\K x \>\K y \>0. (A5) 

(If this is not the case we can 'rotate' the space of spins so that the above inequality holds.) 
We consider only the case where K z > 0. We can do this without the loss of generality when 
the original lattice is bipartite. Furthermore, because of (A3) and (A5), we can assume that 
K x > without loss of generality. With these assumptions, we obtain a solution 

a x = K x - K 2 - K 3 - K 4 = 2(K Z - K x ), 

«(12) = #2 = 0, 

a ( i3) = K 3 = K x + K y , 
a (i4) = K 4 = K x — Ky. 

Correspondingly, (A4) becomes 

(^u) = 2(K z -K x )A(G^\^u)) 
+(K X + K y )k{G^\^ v)) + (K x - K y )k{G^\^ v)). (A6) 
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Here we have chosen K = K z . 

In Section IV, we gave an graph-theoretical argument to show the closure of various sets 
of graphs with respect to multiplication. By a similar argument, we can show that Tp XYZ is 
closed with respect to multiplication. Here Tp XYZ is defined by 

r FXYZ (p) = {G\V G =p, and "All edges in G are green."}. (A7) 

We note that 

T FXYZ (p) n T xxz (p) = T F (p). (A8) 

Following the same line of argument as the one in the main text, we can conclude that a set 
of non-negative variables exists that satisfies 

~PK u K 2 ,K 3 , Ki = E «(G(p))A(G(p)). (A9) 
G( P )erPXYZ( p) 

This, of course, leads to a cluster algorithm for the XYZ model with K z > K x > \K y \ > 
0. It is straightforward to calculate v(G(p)) analytically for small values of S. We note 
that one useful application of this type of algorithm is the one to the XY model with the 
x-representation basis, i.e., the representation basis in which the X- components of spins 
are diagonalized. This is useful because by this representation we can calculate two-point 
correlation between in-plane spin components. This correlation may be more interesting to 
study than correlation between out-of-plane components which can be calculated with the 
^-representation basis. 
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